Abstract: A new, straightforward universal blind detection algorithm for the linear Gaussian channel with ISI is given. A new error exponent is derived, which for some channel parameters is better than Gallager's random coding error exponent.



I. Introduction 

In this paper, the discrete Gaussian channel with intersymbol interference (ISI)

$$y_i = \sum_{j=0}^{l} x_{i-j} h_j + z_i \tag{1}$$

will be considered, where the vector $h = (h_0, h_1, \dots, h_l)$ represents the ISI, and $\{z_i\}$ is white Gaussian noise with variance $\sigma^2$.
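As an illustration of the channel model (1), the following minimal sketch simulates one block of channel output with NumPy. The function name `isi_channel` and the example taps are hypothetical choices made for this sketch, and the input is taken to be zero outside the block, so an input of length n produces n + l output samples.

```python
import numpy as np

def isi_channel(x, h, sigma, rng):
    """Return y = h * x + z for one input block, following model (1).

    x     : input block of length n
    h     : ISI taps (h_0, ..., h_l)
    sigma : standard deviation of the white Gaussian noise
    rng   : numpy random Generator used to draw the noise
    """
    # np.convolve computes (h * x)_i = sum_j x_{i-j} h_j with zero padding,
    # so the output has length n + l.
    clean = np.convolve(h, x)
    noise = sigma * rng.standard_normal(clean.shape)
    return clean + noise

rng = np.random.default_rng(0)
x = rng.standard_normal(8)                      # i.i.d. Gaussian input block
y = isi_channel(x, np.array([1.0, 0.5]), 0.1, rng)
```

With `sigma = 0` the function reduces to a plain convolution, which is a convenient sanity check.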

A similar continuous time model has been studied in Gal- 
lager [6]. He showed that it could be reduced to the form 



(2) 



where the $v_n$ are eigenvalues of the correlation operator. The same is true for the discrete model (1), but the reduction requires knowledge of the covariance matrix $R(i,j) = \sum_{k} h(k-i)\,h(k-j)$, whose eigenvectors should be used as new basis vectors.

Here, however, such knowledge will not be assumed: our goal is to study universal coding for the class of ISI channels of the form (1).

As a motivation, note that the alternative method of first identifying the channel by transmitting a known "training sequence" has some drawbacks. Because the length of the training sequence is limited, the estimation of the channel can be imprecise, and the data sequence is thus decoded according to an incorrect likelihood function. This results in an increase in error rates [2], [3] and in a decrease in capacity [4]. As the training sequence contains no valuable information, the longer it is, the fewer information bits can be carried.

One might think this problem could be solved by choosing the training sequence long enough to ensure precise channel estimation, and then choosing the data block sufficiently long, but this solution seldom works because of the delay constraint and the slow change of the channel in time.

So we will give a straightforward method of coding and decoding, without any Channel Side Information (CSI). To achieve this, we generalise the result of Csiszár and Körner [5] to Gaussian channels with white noise and ISI, using an analogue of the Maximal Mutual Information (MMI) decoder in [5].

Thanks to Imre Csiszár for his numerous corrections to this work.



We will show that our new method is not only universal, 
not depending on the channel, but its error exponent is better 
in many cases than Gallager's [6] lower bound for the case 
when complete CSI is available to the receiver. Previously, 
Gallager's error exponent has been improved for some chan- 
nels, using an MMI decoder, such as for discrete memoryless 
multiple-access channels [7]. 

We do not use generalised maximum likelihood decoding [9], but a generalised version of MMI decoding. This is done by first approximating the channel parameters by maximum likelihood estimation, and then adopting the message whose mutual information with the estimated parameters is maximal. By using an extension of the powerful method of types, we can simply derive the capacity region and the random coding error exponent. In the end, we have a more general result, namely we show how the method of types can be extended to a continuous, non-memoryless environment.

The structure of this correspondence is as follows. In Section II we generalise typical sequences to the ISI channel environment. The main goal of Section III is to give a new method of blind detection. In Section IV we show by numerical results that for some parameters the new error exponent is better than Gallager's random coding error exponent. In Section V we discuss the result, and give a general formula for channels with fading.

II. Definitions 

Let $\gamma_n$ be a sequence of positive numbers with limit 0. The sequence $x \in \mathbb{R}^n$ is $\gamma_n$-typical to an $n$-dimensional continuous distribution $P$, denoted by $x \in T_P$, if

$$\left| -\log(p(x)) - H(P) \right| < n\gamma_n \tag{3}$$

where $p(x)$ denotes the density function, and $H(P)$ the differential entropy of $P$.

Similarly, sequences $x \in \mathbb{R}^n$, $y \in \mathbb{R}^{n+l}$ are jointly $\gamma_n$-typical to the $2n + l$ dimensional joint distribution $P_{X,Y}$, denoted by $(x,y) \in T_{P_{X,Y}}$, if

$$\left| -\log(p_{X,Y}(x,y)) - H(P_{X,Y}) \right| < n\gamma_n$$

In the same way, a sequence $y$ is $\gamma_n$-typical to the conditional distribution $P_{Y|X}$, given that $X = x$, denoted by $y \in T_{P_{Y|X}}(x)$, if

$$\left| -\log(p_{Y|X}(y|x)) - H(Y|X) \right| < n\gamma_n$$

To simplify the proof, in the following $P_X = P$ is always the $n$-dimensional i.i.d. standard normal distribution, the optimal input distribution of a Gaussian channel with power constraint 1. The conditional distribution will be chosen as $P^{h,\sigma}_{Y|X}$ with density

$$p^{h,\sigma}_{Y|X}(y|x) = \frac{1}{\sigma^{n}(2\pi)^{n/2}} \exp\left(-\frac{\|y - h*x\|^2}{2\sigma^2}\right)$$

where $h = (h_0, h_1, h_2, \dots, h_l)$ and $(h*x)_i = \sum_{j=0}^{l} x_{i-j}h_j$, where $x_k = 0$ is understood for $k < 0$. So in this case

$$H_{h,\sigma}(Y|X) = H(h*X + Z \mid X) = H(Z) = n\left[\tfrac{1}{2}\ln(2\pi e) + \ln(\sigma)\right]$$



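The limit of the entropy of $Y = X*h + Z$ as $n \to \infty$ is

$$\lim_{n\to\infty} \frac{H_{h,\sigma}(Y)}{n} = \frac{1}{2}\ln(2\pi e) + \frac{1}{4\pi}\int_0^{2\pi} \ln\left(\sigma^2 + f(\lambda)\right) d\lambda$$

where $f(\lambda) = \sum_{k=-\infty}^{\infty}\left(\sum_{j=0}^{l-|k|} h_j h_{j+|k|}\right) e^{ik\lambda}$, see [1] (here $R_{m,n} = r(m-n) = r(k) = \sum_{j=0}^{l-|k|} h_j h_{j+|k|}$ is the correlation matrix). So the limit of the average mutual information per symbol, that is

$$\lim_{n\to\infty} I_n(h,\sigma) = \lim_{n\to\infty} \frac{1}{n}\left(H_{h,\sigma}(Y) - H_{h,\sigma}(Y|X)\right),$$

is equal to

$$I(h,\sigma) = \frac{1}{4\pi}\int_0^{2\pi} \ln\left(1 + \frac{f(\lambda)}{\sigma^2}\right) d\lambda;$$

moreover, the sequence $I_n(h,\sigma)$ is non-increasing (see [1]).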
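The limiting mutual information can be evaluated numerically. The sketch below assumes that the limit takes the standard Szegő form $(1/4\pi)\int_0^{2\pi}\ln(1 + f(\lambda)/\sigma^2)\,d\lambda$ for unit-power Gaussian input, measured in nats, and uses the identity $f(\lambda) = |\sum_j h_j e^{-ij\lambda}|^2$ for real taps. The function names and the grid size are hypothetical choices for this illustration.

```python
import numpy as np

def spectrum(h, lam):
    """f(lambda) = sum_k r(k) e^{ik lambda}, computed as |sum_j h_j e^{-ij lambda}|^2."""
    h = np.asarray(h, dtype=float)
    j = np.arange(len(h))
    return np.abs(np.exp(-1j * np.outer(lam, j)) @ h) ** 2

def mutual_information(h, sigma, grid=4096):
    """Approximate (1/4pi) * int_0^{2pi} ln(1 + f/sigma^2) dlambda (in nats)."""
    lam = 2 * np.pi * np.arange(grid) / grid
    # The mean over a uniform grid approximates (1/2pi) * integral,
    # so the factor 1/2 gives the 1/(4pi) normalisation.
    return 0.5 * np.mean(np.log1p(spectrum(h, lam) / sigma**2))

print(mutual_information([1.0, 0.5], 0.5))
```

For a single tap (no ISI) the spectrum is constant and the function reduces to the familiar $\tfrac12\ln(1 + h_0^2/\sigma^2)$.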

We will consider a finite set of channels that grows subexponentially with $n$ and, in the limit, is dense in the set of all ISI channels. To this end, define the set of approximating ISI as

$$\mathcal{H}_n = \{h \in \mathbb{R}^{l_n} : h_i = k_i\gamma_n,\ |h_i| < P_n,\ k_i \in \mathbb{Z},\ \forall i \in \{1,2,\dots,l_n\}\}$$

where $l_n$ is the length of the ISI, $P_n$ is the power constraint per symbol, and $\gamma_n$ is the "coarse graining", intuitively the precision of detection. Similarly we define the set of approximating variances as

$$\mathcal{V}_n = \{\sigma \in \mathbb{R} : \sigma = k\gamma_n,\ 1/2 < |\sigma| < P_n,\ k \in \mathbb{Z}^+\}$$

These two sets form the approximating set of parameters, denoted by $S_n = \mathcal{H}_n \times \mathcal{V}_n$.

Below we set l n = [log 2 (ri)], P n = nio, 7„ = n" 



Definition 1 The ISI type of a pair $(x, y) \in \mathbb{R}^n \times \mathbb{R}^{n+l_n}$ is the pair $(h_n, \sigma_n) \in S_n$ defined by

$$h_n = \arg\min_{h \in \mathcal{H}_n} \|y - h*x\|, \qquad \sigma_n = \arg\min_{\sigma \in \mathcal{V}_n} \left|\sigma^2 - \frac{\|y - h_n*x\|^2}{n}\right|$$

III. Lemmas, Theorem

We summarise the results of this section. Lemma 1 shows that the above definition of ISI type is consistent, in the sense that $y$ is conditionally $P^{h,\sigma}_{Y|X}$-typical given $x$, at least when $\|y - h*x\|^2$ is not too large. Lemma 2 gives the properties which we need for our method, and proves that almost all randomly generated sequences have these properties. Lemma 4 gives an upper bound on the measure of the set of "problematic" output signals, namely those that are typical to two codewords, so that they can result from two different codewords with different channels. Lemma 5 shows that if the channel parameters are estimated via maximum likelihood (ML), the codewords and the noise cannot be strongly correlated. Lemma 6 bounds the probability of the event that an output sequence is typical with another codeword with respect to another channel. All lemmas are used in Theorem 1, which gives the main result and defines the detection method precisely.

Lemma 1

Let $(h(i), \sigma(i))$ denote the ISI type of the pair $(x_i, y)$. When

$$\left|\sigma(i)^2 - \frac{\|y - h(i)*x_i\|^2}{n}\right| < \gamma_n,$$

so that the detected variance is in the interior of the set of approximating variances, then

$$y \in T_{P^{h(i),\sigma(i)}_{Y|X}}(x_i).$$

Proof: Indeed, if

$$\left|\sigma(i)^2 - \frac{\|y - h(i)*x_i\|^2}{n}\right| < \gamma_n$$

then

$$\left|\frac{\|y - h(i)*x_i\|^2}{2\sigma(i)^2} - \frac{n}{2}\right| < n\gamma_n.$$

With

$$-\log\left(p^{h(i),\sigma(i)}_{Y|X}(y|x_i)\right) = \frac{n}{2}\log\left(2\pi\sigma(i)^2\right) + \frac{\|y - h(i)*x_i\|^2}{2\sigma(i)^2}$$

we get

$$\left|-\log\left(p^{h(i),\sigma(i)}_{Y|X}(y|x_i)\right) - H_{h(i),\sigma(i)}(Y|X)\right| = \left|\frac{\|y - h(i)*x_i\|^2}{2\sigma(i)^2} - \frac{n}{2}\right|$$

and, by the definition of $y \in T_{P^{h(i),\sigma(i)}_{Y|X}}(x_i)$, what is needed is exactly

$$\left|-\log\left(p^{h(i),\sigma(i)}_{Y|X}(y|x_i)\right) - H_{h(i),\sigma(i)}(Y|X)\right| < n\gamma_n. \qquad ■$$

Note that the type concept of Definition 1 does not apply to separate input or output sequences, only to pairs $(x, y)$.
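The ISI type of a pair $(x, y)$ can be sketched by a brute-force search over a small parameter grid. The code below is only an illustration under stated assumptions: the helper names `grid` and `isi_type`, the use of a single `limit` and `step` for both taps and standard deviations, and the exhaustive search itself are hypothetical choices, and the search is exponential in the number of taps, so it is usable only for very short ISI.

```python
import itertools
import numpy as np

def grid(limit, step):
    """All multiples k*step with |k*step| < limit (the 'coarse grained' values)."""
    kmax = int(limit / step) + 1
    vals = np.arange(-kmax, kmax + 1) * step
    return vals[np.abs(vals) < limit]

def isi_type(x, y, n_taps, limit, step):
    """Grid taps minimising ||y - h*x||, then the grid sigma whose square
    is closest to ||y - h*x||^2 / n."""
    n = len(x)
    taps = grid(limit, step)
    best_h, best_err = None, np.inf
    for cand in itertools.product(taps, repeat=n_taps):
        err = np.sum((y - np.convolve(cand, x)) ** 2)
        if err < best_err:
            best_h, best_err = np.array(cand), err
    sigmas = taps[taps > 0.5]          # sigma grid: 1/2 < sigma < limit
    best_sigma = sigmas[np.argmin(np.abs(sigmas**2 - best_err / n))]
    return best_h, best_sigma
```

For a noise-free output generated by taps that lie on the grid, the search returns those taps exactly, and the variance estimate falls back to the smallest admissible grid value.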



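Lemma 2

For arbitrarily small $\delta > 0$, if $n$ is large enough, there exists a set $A \subset T_P$ with $P^n(A) > 1 - \delta$, where $P$ is the $n$-dimensional standard normal distribution, such that for all $x \in A$ and $k, l \in \{0, 1, \dots, l_n\}$, $k \ne l$,

$$\left|\frac{\sum_{j=0}^{n} x_{j-k}x_{j-l}}{n}\right| < \gamma_n \tag{4}$$

$$\left|\frac{\sum_{j=0}^{n} x_{j-k}x_{j-k}}{n} - 1\right| < \gamma_n \tag{5}$$

$$\left|-\frac{1}{n}\ln p(x) - \frac{1}{n}H(P)\right| < \gamma_n \tag{6}$$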

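Proof: Take $n$ i.i.d. standard Gaussian random variables $X_1, X_2, \dots, X_n$. Fix $k \ne l$ with $1 \le k, l \le n$. By Chebyshev's inequality,

$$\Pr\left\{\left|\sum_{j=0}^{n} X_{j-k}X_{j-l}\right| > \xi\right\} \le \frac{n}{\xi^2}.$$

From this, with $\xi = \gamma_n n$,

$$\Pr\left\{\left|\frac{\sum_{j=0}^{n} X_{j-k}X_{j-l}}{n}\right| > \gamma_n\right\} \le \frac{1}{n\gamma_n^2} = \delta_n,$$

which means that there exists a set in $\mathbb{R}^n$ whose $P^n$-measure is at least $1 - \delta_n$ and for all sequences from this set

$$\left|\frac{\sum_{j=0}^{n} x_{j-k}x_{j-l}}{n}\right| < \gamma_n.$$

Similarly there exist such sets for all $k \ne l$ in $\{0, 1, \dots, l_n\} \times \{0, 1, \dots, l_n\}$. By a completely analogous procedure we can construct sets which satisfy (5) and (6). The intersection of these sets has $P$-measure at least $1 - 2\delta_n(l_n^2 + 1)$. As $\delta_n l_n^2 \to 0$, this proves the Lemma. ■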
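Conditions (4) and (5) of Lemma 2 are statements about empirical correlations of shifted copies of the input, and they can be checked numerically for a given sequence. The following sketch is an illustration only; the function name `lemma2_statistics` is hypothetical, and samples with negative index are taken to be zero as in the channel model.

```python
import numpy as np

def lemma2_statistics(x, max_lag):
    """Return (largest |sum_j x_{j-k} x_{j-l}| / n over k != l,
               largest |sum_j x_{j-k}^2 / n - 1| over k) for lags 0..max_lag."""
    n = len(x)
    # Row k holds x delayed by k samples, zero-padded at the start.
    shifted = np.array([np.concatenate([np.zeros(k), x[: n - k]])
                        for k in range(max_lag + 1)])
    gram = shifted @ shifted.T / n
    off_diag = gram - np.diag(np.diag(gram))
    return np.max(np.abs(off_diag)), np.max(np.abs(np.diag(gram) - 1.0))

rng = np.random.default_rng(0)
cross, power = lemma2_statistics(rng.standard_normal(40000), max_lag=4)
print(cross, power)
```

For long i.i.d. standard Gaussian sequences both returned values are small, which is the behaviour the lemma requires with probability close to 1.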
The Lebesgue measure will be denoted by $\lambda$; its dimension is not specified, as it will always be clear from the context.

Lemma 3

If $n$ is large enough then the set $A$ in Lemma 2 satisfies

$$2^{H(P) - 2n\gamma_n} \le \lambda(A) \le 2^{H(P) + n\gamma_n}$$

and for any $m$-dimensional continuous distribution $Q(\cdot)$

$$\lambda(T_Q) \le 2^{H(Q) + n\gamma_n}$$

where $T_Q$ is the set of typical sequences to $Q$, see (3).

Proof: Since

$$1 \ge P(A) = \int_A p(x)\,\lambda(dx) \ge 1 - \delta$$

by the previous Lemma, by using $2^{-(H(P) - n\gamma_n)} \ge p(x) \ge 2^{-(H(P) + n\gamma_n)}$ on $x \in T_P$, and $A \subset T_P$, we get

$$2^{H(P) + n\gamma_n} \ge \lambda(A) \ge (1 - \delta)\,2^{H(P) - n\gamma_n} \ge 2^{H(P) - 2n\gamma_n}$$

if $n$ is large enough. Similarly, from

$$1 \ge Q(T_Q) = \int_{T_Q} q(x)\,\lambda(dx)$$

we get

$$2^{H(Q) + n\gamma_n} \ge \lambda(T_Q). \qquad ■$$

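The next lemma is an analogue of the Packing Lemma in [5].

Lemma 4

For all $R > 0$, $\delta > 0$, there exist at least $2^{n(R-\delta)}$ different sequences in $\mathbb{R}^n$ which are elements of the set $A$ from Lemma 2, such that for each pair of ISI channels with $h, \hat{h} \in \mathcal{H}_n$, $\sigma, \hat{\sigma} \in \mathcal{V}_n$, and for all $i \in \{1, 2, \dots, M\}$

$$\lambda\Big(T_{P^{h,\sigma}_{Y|X}}(x_i) \cap \bigcup_{j \ne i} T_{P^{\hat{h},\hat{\sigma}}_{Y|X}}(x_j)\Big) \le 2^{-[n|I(\hat{h},\hat{\sigma}) - R|^{+} - H_{h,\sigma}(Y|X)]} \tag{7}$$

provided that $n \ge n_0(n, m, \delta)$.

Proof: We shall use the method of random selection. For fixed constants $n$, $m$, let $\mathcal{C}_m$ be the family of all ordered collections $C = \{x_1, x_2, \dots, x_m\}$ of $m$ not necessarily different sequences in $A$. Notice that if some $C = \{x_1, x_2, \dots, x_m\} \in \mathcal{C}_m$ satisfies (7) for every $i$ and every pair of Gaussian channels $(h,\sigma), (\hat{h},\hat{\sigma})$, then the $x_i$'s are necessarily distinct. For any collection $C \in \mathcal{C}_m$, denote the left-hand side of (7) by $u_i(C, h, \hat{h})$. Since for $x \in T_P$

$$\lambda\{T_{P^{h,\sigma}_{Y|X}}(x)\} \le 2^{H_{h,\sigma}(Y|X) + n\gamma_n}$$

from Lemma 3, a $C \in \mathcal{C}_m$ satisfies (7) if, for all $i$,

$$u_i(C) = \sum_{h,\hat{h} \in \mathcal{H}_n,\ \sigma,\hat{\sigma} \in \mathcal{V}_n} u_i(C, h, \hat{h})\, 2^{n[I(\hat{h},\hat{\sigma}) - R] - H_{h,\sigma}(Y|X)}$$

is at most 1. Notice that if $C \in \mathcal{C}_m$ satisfies

$$\frac{1}{m}\sum_{i=1}^{m} u_i(C) \le 1/2 \tag{8}$$

then $u_i(C) \le 1$ for at least $m/2$ indices $i$. Further, if $C'$ is the subcollection of $C$ with the above indices, then $u_i(C') \le u_i(C) \le 1$ for every such index $i$. Hence the Lemma will be proved if, for an $m$ with

$$2 \cdot 2^{n(R-\delta)} \le m \le 2^{n(R - \delta/2)} \tag{9}$$

we find a $C \in \mathcal{C}_m$ which satisfies (8).

Choose $C \in \mathcal{C}_m$ at random, according to the uniform distribution on $A$. In other words, let $W^m = (W_1, W_2, \dots, W_m)$ be independent RV's, each uniformly distributed over $A$. In order to prove that (8) is true for some $C \in \mathcal{C}_m$, it suffices to show that

$$E\,u_i(W^m) \le \frac{1}{2}, \qquad i = 1, 2, \dots, m. \tag{10}$$

To this end, we bound $E\,u_i(W^m, h, \hat{h})$. Recalling that $u_i(C, h, \hat{h})$ denotes the left-hand side of (7), we have

$$E\,u_i(W^m, h, \hat{h}) = \tag{11}$$

$$\int \Pr\Big\{y \in T_{P^{h,\sigma}_{Y|X}}(W_i) \cap \bigcup_{j \ne i} T_{P^{\hat{h},\hat{\sigma}}_{Y|X}}(W_j)\Big\}\,\lambda(dy) \tag{12}$$

As the $W_j$ are independent and identically distributed, the probability under the integral is bounded above by

$$\sum_{j \ne i} \Pr\Big\{y \in T_{P^{h,\sigma}_{Y|X}}(W_i) \cap T_{P^{\hat{h},\hat{\sigma}}_{Y|X}}(W_j)\Big\} = \tag{13}$$

$$(m - 1) \cdot \Pr\Big\{y \in T_{P^{h,\sigma}_{Y|X}}(W_i)\Big\} \cdot \Pr\Big\{y \in T_{P^{\hat{h},\hat{\sigma}}_{Y|X}}(W_j)\Big\} \tag{14}$$

As the $W_j$'s are uniformly distributed over $A$, we have for all fixed $y$

$$\Pr\Big\{y \in T_{P^{h,\sigma}_{Y|X}}(W_i)\Big\} = \frac{\lambda\{x : x \in T_{P_X},\ y \in T_{P^{h,\sigma}_{Y|X}}(x)\}}{\lambda(A)}.$$

The set in the numerator is non-void only if $y \in T_{P^{h,\sigma}_Y}$. In this case it can be written as $T_{\bar{P}_{X|Y}}(y)$, where $\bar{P}$ is a conditional distribution for which

$$P_X(a)\,P^{h,\sigma}_{Y|X}(b|a) = P^{h,\sigma}_Y(b)\,\bar{P}_{X|Y}(a|b).$$

Thus by Lemma 3 and Lemma 2

$$\Pr\Big\{y \in T_{P^{h,\sigma}_{Y|X}}(W_i)\Big\} \le \frac{2^{H_{h,\sigma}(X|Y) + n\gamma_n}}{2^{H(X) - 2n\gamma_n}} = 2^{-n(I(h,\sigma) - 3\gamma_n)}$$

if $y \in T_{P^{h,\sigma}_Y}$, and $\Pr\{y \in T_{P^{h,\sigma}_{Y|X}}(W_i)\} = 0$ otherwise. So, if we upper bound $\lambda(T_{P^{h,\sigma}_Y})$ by $2^{H_{h,\sigma}(Y) + n\gamma_n}$, with the use of Lemma 3, then from (12) and (14) we get

$$E\,u_i(W^m, h, \hat{h}) \le \lambda(T_{P^{h,\sigma}_Y})\,(m - 1)\,2^{-n[I(h,\sigma) + I(\hat{h},\hat{\sigma}) - 6\gamma_n]} \le 2^{-n[I(\hat{h},\hat{\sigma}) - R + \delta - 7\gamma_n] + H_{h,\sigma}(Y|X)}.$$

Let $n$ be so large that $7\gamma_n < \delta/2$; then we get

$$E\,u_i(W^m) \le |\mathcal{H}_n|^2\,|\mathcal{V}_n|^2\,2^{-n\delta/2},$$

which proves (10), since the number of approximating parameters grows subexponentially. ■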
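Lemma 5

For $x \in A$ from Lemma 2 and $y$ as in (1), and $\hat{h} = \arg\min_{h \in \mathcal{H}_n} \|y - h*x\|$, and $z = y - \hat{h}*x$,

$$\left|\frac{\sum_{j=1}^{n} z_j x_{j-k}}{n}\right| \le \gamma_n, \qquad k \in \{0, \dots, l_n\}.$$

Proof: (Indirect) Suppose that

$$\frac{\sum_{j=1}^{n} z_j x_{j-k}}{n} = \lambda_k > \gamma_n$$

for some $k \in \{0, \dots, l_n\}$. Then let

$$\tilde{h}_j = \begin{cases} \hat{h}_j + \gamma_n & \text{if } j = k \\ \hat{h}_j & \text{otherwise.} \end{cases}$$

We will show that $\|y - \tilde{h}*x\| < \|y - \hat{h}*x\|$, which contradicts the definition of $\hat{h}$.

Now,

$$\|y - \tilde{h}*x\|^2 = \sum_{j=1}^{n}\Big(y_j - \sum_{g=0}^{l_n} \tilde{h}_g x_{j-g}\Big)^2 = \sum_{j=1}^{n}\Big(y_j - \sum_{g=0}^{l_n} \hat{h}_g x_{j-g} - \gamma_n x_{j-k}\Big)^2$$

$$= \sum_{j=1}^{n}\left(z_j - \gamma_n x_{j-k}\right)^2 = \sum_{j=1}^{n}\left(z_j^2 - 2\gamma_n z_j x_{j-k} + \gamma_n^2 x_{j-k}^2\right).$$

On account of (5),

$$< \|z\|^2 - 2n\gamma_n\lambda_k + \gamma_n^2(n + n\gamma_n) < \|z\|^2 - (n - n\gamma_n)\gamma_n^2. \qquad ■$$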
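Lemma 5 is the grid analogue of the normal equations of least squares: when the minimisation is carried out over all real tap vectors instead of the grid, the residual is exactly orthogonal to every shifted copy of the input. The sketch below illustrates this unconstrained case only, which is an assumption made for simplicity; the helper names `shift_matrix` and `ls_residual_correlation` are hypothetical.

```python
import numpy as np

def shift_matrix(x, n_taps):
    """Matrix whose column k is x delayed by k samples, so that
    shift_matrix(x, n_taps) @ h equals np.convolve(h, x)."""
    n = len(x)
    X = np.zeros((n + n_taps - 1, n_taps))
    for k in range(n_taps):
        X[k:k + n, k] = x
    return X

def ls_residual_correlation(x, y, n_taps):
    """Least-squares tap estimate and the normalised correlations
    sum_j z_j x_{j-k} / n between the residual z and the shifted inputs."""
    X = shift_matrix(x, n_taps)
    h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    z = y - X @ h_hat
    return h_hat, X.T @ z / len(x)
```

On the grid $\mathcal{H}_n$ the correlations are no longer exactly zero, and the lemma bounds them by the grid step $\gamma_n$ instead.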



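Lemma 6

Let $\delta > 0$ and $x \in A$ from Lemma 2. Let $(h,\sigma) \in \mathcal{H}_n \times \mathcal{V}_n$ and $(h^\circ,\sigma^\circ) \in \mathcal{H}_n \times \mathcal{V}_n$ be two arbitrary (ISI function, variance) pairs. Let $y$ and $x$ be such that

$$h = \arg\min_{h' \in \mathcal{H}_n} \|y - h'*x\| \tag{15}$$

$$\sigma^2 = \|y - h*x\|^2 / n \tag{16}$$

Then

$$p^{h^\circ,\sigma^\circ}_{Y|X}(y|x) \le 2^{-n[d((h,\sigma)\|(h^\circ,\sigma^\circ)) - \delta] - H_{h,\sigma}(Y|X)} \tag{17}$$

Here

$$d((h,\sigma)\|(h^\circ,\sigma^\circ)) = -\frac{1}{2}\log\left(\frac{\sigma^2}{\sigma^{\circ 2}}\right) - \frac{1}{2} + \frac{\sigma^2 + \|h - h^\circ\|^2}{2\sigma^{\circ 2}}$$

is an information divergence for Gaussian distributions, positive if $(h,\sigma) \ne (h^\circ,\sigma^\circ)$.

Proof:

$$p^{h^\circ,\sigma^\circ}_{Y|X}(y|x) = 2^{-\log\left(\frac{p^{h,\sigma}_{Y|X}(y|x)}{p^{h^\circ,\sigma^\circ}_{Y|X}(y|x)}\right) + \log\left(p^{h,\sigma}_{Y|X}(y|x)\right)} \tag{18}$$

and $y \in T_{P^{h,\sigma}_{Y|X}}(x)$ by the definition, so

$$-\log\left(p^{h,\sigma}_{Y|X}(y|x)\right) \ge H_{h,\sigma}(Y|X) - n\gamma_n > H_{h,\sigma}(Y|X) - n\delta/2 \tag{19}$$

if $n$ is large enough. With this:

$$\log\left(\frac{p^{h,\sigma}_{Y|X}(y|x)}{p^{h^\circ,\sigma^\circ}_{Y|X}(y|x)}\right) = \log\left(\frac{\sigma^{\circ n}\exp\left(-\frac{\|y - h*x\|^2}{2\sigma^2}\right)}{\sigma^{n}\exp\left(-\frac{\|y - h^\circ*x\|^2}{2\sigma^{\circ 2}}\right)}\right)$$

$$= -\frac{n}{2}\log\left(\frac{\sigma^2}{\sigma^{\circ 2}}\right) - \frac{\|y - h*x\|^2}{2\sigma^2} + \frac{\|y - h^\circ*x\|^2}{2\sigma^{\circ 2}} \ge -\frac{n}{2}\log\left(\frac{\sigma^2}{\sigma^{\circ 2}}\right) - \frac{n + n\gamma_n}{2} + \frac{\|y - h^\circ*x\|^2}{2\sigma^{\circ 2}} \tag{20}$$

Introduce the notation $z = y - h*x$. Then

$$\|y - h^\circ*x\|^2 = \|z + h*x - h^\circ*x\|^2 = \sum_{j=1}^{n}\Big(z_j + \sum_{k=0}^{l_n}(h_k - h^\circ_k)x_{j-k}\Big)^2$$

$$= \sum_{j=1}^{n}\Big(z_j^2 + 2z_j\sum_{k=0}^{l_n}(h_k - h^\circ_k)x_{j-k} + \Big(\sum_{k=0}^{l_n}(h_k - h^\circ_k)x_{j-k}\Big)^2\Big)$$

$$= \|z\|^2 + 2\sum_{k=0}^{l_n}(h_k - h^\circ_k)\sum_{j=1}^{n} z_j x_{j-k} + \sum_{j=1}^{n}\Big(\sum_{k=0}^{l_n}(h_k - h^\circ_k)x_{j-k}\Big)^2.$$

Using Lemma 5,

$$\Big|\sum_{j=0}^{n} z_j x_{j-k}\Big| \le n\gamma_n,$$

so the expression above is at least

$$\|z\|^2 - 2\sum_{k=0}^{l_n}|h_k - h^\circ_k|\,n\gamma_n + \sum_{k=0}^{l_n}(h_k - h^\circ_k)^2\sum_{j=0}^{n} x_{j-k}^2 + \sum_{j=0}^{n}\sum_{k \ne m}(h_m - h^\circ_m)(h_k - h^\circ_k)\,x_{j-k}x_{j-m}$$

$$\ge \|z\|^2 - 4n\,l_nP_n\gamma_n + (n - n\gamma_n)\|h - h^\circ\|^2 - n(l_nP_n)^2\gamma_n.$$

With this, and with $\|z\|^2 = n\sigma^2$, we can bound (20) from below by

$$n\left[-\frac{1}{2}\log\left(\frac{\sigma^2}{\sigma^{\circ 2}}\right) - \frac{1}{2} + \frac{\sigma^2 + \|h - h^\circ\|^2}{2\sigma^{\circ 2}} - \frac{\gamma_n}{2} - \left(4l_nP_n\gamma_n + 4\gamma_nP_n^2 + (l_nP_n)^2\gamma_n\right)\right]. \tag{21}$$

If $n$ is large enough, then $\max\left(4l_nP_n\gamma_n,\ 4\gamma_n l_nP_n^2,\ (l_nP_n)^2\gamma_n\right)$ is so small that the sum of the error terms in (21) is below $\delta/2$, since $\lim_{n\to\infty} P_n^2 l_n^2\gamma_n = 0$. Using this, we continue from (21):

$$= n\left[d((h,\sigma)\|(h^\circ,\sigma^\circ)) - \frac{\gamma_n}{2} - \left(4l_nP_n\gamma_n + 4\gamma_nP_n^2 + (l_nP_n)^2\gamma_n\right)\right] \ge n\left[d((h,\sigma)\|(h^\circ,\sigma^\circ)) - \delta/2\right].$$

Substituting this and (19) into (18) gives the desired result. ■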
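The divergence used in Lemma 6 is simple to evaluate. The sketch below assumes the per-symbol form $-\tfrac12\log(\sigma^2/\sigma^{\circ 2}) - \tfrac12 + (\sigma^2 + \|h - h^\circ\|^2)/(2\sigma^{\circ 2})$ with natural logarithms; the choice of logarithm base and the function name `gaussian_isi_divergence` are assumptions of this illustration.

```python
import numpy as np

def gaussian_isi_divergence(h, sigma, h0, sigma0):
    """Per-symbol divergence d((h, sigma) || (h0, sigma0)) in nats.

    h, h0         : ISI tap vectors of equal length
    sigma, sigma0 : noise standard deviations (both positive)
    """
    h = np.asarray(h, dtype=float)
    h0 = np.asarray(h0, dtype=float)
    variance_term = -0.5 * np.log(sigma**2 / sigma0**2) - 0.5
    mismatch_term = (sigma**2 + np.sum((h - h0) ** 2)) / (2 * sigma0**2)
    return variance_term + mismatch_term

print(gaussian_isi_divergence([1.0, 0.5], 1.0, [0.5, 0.5], 2.0))
```

The value is zero when both parameter pairs coincide and grows with the tap mismatch $\|h - h^\circ\|^2$ and with the mismatch of the two variances.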
Now we can state and prove our main theorem.

Theorem 1 For arbitrarily given $R > \varepsilon > 0$ and blocklength $n > n_0(R, \varepsilon)$, there exists a code $(f, \phi)$ (coding/decoding function pair) with rate $\ge R - \varepsilon$ such that for all ISI channels with parameters $h^\circ \in \mathbb{R}^{l_n}$, $|h^\circ| \le P_n$, $\sigma^\circ \le P_n$, $\sigma^\circ \ne 0$, the average error probability satisfies

$$P_e(h^\circ, \sigma^\circ, f, \phi) \le 2^{-n(E_r(R, h^\circ, \sigma^\circ) - \varepsilon)} \tag{22}$$

Here

$$E_r(R, h^\circ, \sigma^\circ) = \min_{h,\sigma}\left\{d((h,\sigma)\|(h^\circ,\sigma^\circ)) + |I(h,\sigma) - R|^{+}\right\} \tag{23}$$

where $d((h,\sigma)\|(h^\circ,\sigma^\circ))$ is the information divergence defined in Lemma 6.

Remark 1

The expression minimised above is a continuous function of $h^\circ, \sigma^\circ, h, \sigma, R$.

Proof: Let $\delta = \varepsilon/3$, and let

$$C = \{x_1, x_2, \dots, x_M\}$$

be the set of codesequences from Lemma 4, so $M \ge 2^{n(R-\delta)}$. The coding function sends the $i$-th codeword for message $i$: $f(i) = x_i$.

The decoding happens as follows. Denote the ISI type of $(y, x_i)$ by $(h(i), \sigma(i))$ for all $i \in \{1, 2, \dots, M\}$. Using these parameters we define the decoding rule as

$$\phi(y) = i \iff i = \arg\max_{j} I(h(j), \sigma(j));$$

in case of non-uniqueness, we declare an error.

Now we bound the error:

$$P_e \le P_{out} + \frac{1}{M}\sum_{i=1}^{M} P^{h^\circ,\sigma^\circ}_{Y|X}\left(\phi(y) \ne i \mid x_i, \mathcal{E}^c\right) \tag{24}$$

where $P_{out}$ denotes the probability of the event $\mathcal{E}$ that the detected variance, for some $i \in \{1, 2, \dots, M\}$, does not satisfy $\left|\sigma(i)^2 - \|y - h(i)*x_i\|^2/n\right| < \gamma_n$. We first bound the probability of this event. If $\mathcal{E}$ occurs then $\sigma(i)$ is an extremal point of the approximating set of parameters, so $\|y - h(i)*x_i\|^2 > nP_n$. Since $h = (0, 0, \dots, 0)$ is an element of the approximating set of ISI, this means that the power of the incoming sequence is greater than $nP_n$. The probability of this is bounded by an expression of the form

$$\left(2\,\mathrm{erfc}(\cdot)\right)^{n} \approx e^{-n^{1+1/8}},$$

where $\mathrm{erfc}(\cdot)$ is the complementary error function. This probability converges to 0 faster than exponentially, which, as we will see, means that in (24) the second term is dominant.

Consider the second term of (24). If we sent $i$, then $\phi(y) \ne i$ occurs if and only if, for some $j \ne i$,

$$I(h(j), \sigma(j)) \ge I(h(i), \sigma(i)).$$

By Lemma 1 we know that

$$y \in T_{P^{h(i),\sigma(i)}_{Y|X}}(x_i) \cap T_{P^{h(j),\sigma(j)}_{Y|X}}(x_j),$$

since we supposed that we are not in the event $\mathcal{E}$. So the probability in the second term of (24) is

$$P^{h^\circ,\sigma^\circ}_{Y|X}\left(\phi(y) \ne i \mid x_i, \mathcal{E}^c\right) \le \sum_{\substack{(h,\sigma),(\hat{h},\hat{\sigma}):\\ I(\hat{h},\hat{\sigma}) \ge I(h,\sigma)}}\ \int_{T_{P^{h,\sigma}_{Y|X}}(x_i)\,\cap\,\bigcup_{j \ne i} T_{P^{\hat{h},\hat{\sigma}}_{Y|X}}(x_j)} p^{h^\circ,\sigma^\circ}_{Y|X}(y|x_i)\,dy.$$

With Lemma 6 (substituting $(\hat{h},\hat{\sigma}) = (h(j),\sigma(j))$, $(h,\sigma) = (h(i),\sigma(i))$, $\delta = \varepsilon/3$) and from Lemma 4 we get

$$P_e \le \sum_{(h,\sigma),(\hat{h},\hat{\sigma}) \in (\mathcal{H}_n \times \mathcal{V}_n)^2} 2^{-n[d((h(i),\sigma(i))\|(h^\circ,\sigma^\circ)) + |I(h(j),\sigma(j)) - R|^{+} - 2\varepsilon/3]}$$

if $n$ is large enough. From this, since the number of approximating channel parameters grows subexponentially, we get (22). ■
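The decoding rule used in the proof can be sketched in a few lines. This is a simplified illustration, not the exact procedure of the theorem: the channel parameters are estimated by unconstrained least squares instead of a search over the approximating grid, the mutual information is assumed to have the Szegő form $(1/4\pi)\int_0^{2\pi}\ln(1 + f(\lambda)/\sigma^2)\,d\lambda$, and all function names are hypothetical.

```python
import numpy as np

def mutual_information(h, sigma, grid=1024):
    """(1/4pi) * int_0^{2pi} ln(1 + f(lambda)/sigma^2) dlambda, in nats."""
    lam = 2 * np.pi * np.arange(grid) / grid
    f = np.abs(np.exp(-1j * np.outer(lam, np.arange(len(h)))) @ h) ** 2
    return 0.5 * np.mean(np.log1p(f / sigma**2))

def estimate_channel(x, y, n_taps):
    """Least-squares taps and residual standard deviation for codeword x."""
    n = len(x)
    X = np.zeros((n + n_taps - 1, n_taps))
    for k in range(n_taps):
        X[k:k + n, k] = x            # column k is x delayed by k samples
    h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma_hat = np.sqrt(np.sum((y - X @ h_hat) ** 2) / n)
    return h_hat, sigma_hat

def mmi_decode(codebook, y, n_taps):
    """Index of the codeword whose estimated channel has maximal I(h, sigma)."""
    scores = [mutual_information(*estimate_channel(x, y, n_taps))
              for x in codebook]
    return int(np.argmax(scores))
```

A wrong codeword explains almost none of the received signal, so its estimated taps are close to zero and its estimated noise variance is large, which gives a small mutual information; this is why the maximisation singles out the transmitted codeword without any channel knowledge.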



IV. Numerical Result 

We compare the new error exponent with Gallager's error exponent. Gallager derived a method to send digital information through a channel with continuous time and alphabet, with a given channel function. This result can be easily modified to discrete time, as in e.g. [10], [8]. The linear Gaussian channel with discrete time parameter and fading vector $h = (h_0, h_1, \dots, h_l)$ can be formulated as follows:

$$y = Hx + z \tag{25}$$

where $x$ is the input vector and

$$H = \begin{pmatrix}
h_0 & 0 & \cdots & 0 \\
h_1 & h_0 & \ddots & \vdots \\
\vdots & h_1 & \ddots & 0 \\
h_l & \vdots & \ddots & h_0 \\
0 & h_l & & h_1 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & h_l
\end{pmatrix} \tag{26}$$



Following [6], we define a right and a left eigenbasis for the matrix $H$. The right eigenbasis $r_1, r_2, \dots, r_n$ is nothing else than the eigenbasis of $H^T H$, where $T$ means the transpose. The left eigenbasis is $l_1, l_2, \dots, l_m$, where $l_j = H r_j / |H r_j|$ (or a basis of $H H^T$), and we denote $\lambda_j = |H r_j|$. As in the work of Gallager, $r_i^T r_j = \delta_{ij} = l_i^T l_j$, because the $r_j$'s form an orthonormal eigenbasis and $l_i^T l_j = \dfrac{r_i^T H^T H r_j}{|H r_i|\,|H r_j|}$.

Writing $x$ in the basis $r_1, \dots, r_n$ we get $x = \sum_{i=1}^{n} x_i r_i$, and we know that the $x_i$'s also form an i.i.d. Gaussian sequence. Writing the output as $y = \sum_{i} y_i l_i$ we get

$$y_i = \lambda_i x_i + z_i \tag{27}$$

for all $i$, where $z_i$ is the white Gaussian noise in the basis $l_1, \dots, l_m$ (in which it is also white). In many works this equation is used as a channel with fading, where the $\lambda_i$'s are i.i.d. random variables. This is a false approach. If the receiver knows the ISI $h$, then these constants can be computed. If the ISI is a random vector of, e.g., i.i.d. random variables, then the $\lambda_i$'s are not necessarily i.i.d.
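The decomposition into parallel channels can be reproduced numerically with a singular value decomposition: the right singular vectors of $H$ form an eigenbasis of $H^T H$, the left singular vectors are $H r_j / |H r_j|$, and the singular values play the role of the gains $\lambda_j$. The sketch below assumes that $H$ is the full $(n + l) \times n$ convolution matrix of the taps; the function names are hypothetical.

```python
import numpy as np

def convolution_matrix(h, n):
    """(n + l) x n matrix H with H @ x == np.convolve(h, x) for len(x) == n."""
    l = len(h) - 1
    H = np.zeros((n + l, n))
    for j in range(n):
        H[j:j + l + 1, j] = h        # column j is h shifted down by j rows
    return H

def parallel_channels(h, n):
    """Return (left basis, gains lambda_j, right basis) of the ISI channel."""
    H = convolution_matrix(h, n)
    left, gains, right_t = np.linalg.svd(H, full_matrices=False)
    return left, gains, right_t.T

h = np.array([1.0, 0.5, 0.25])
left, gains, right = parallel_channels(h, 6)
print(gains)
```

In these coordinates the channel acts as $y_j = \lambda_j x_j + z_j$, and the gains are deterministic functions of $h$ rather than independent random draws.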

If we interleave our codeword, with this formula we get $n' = n + l$ parallel channels, each having SNR $\lambda_i/\sigma$. We know that the error exponent given by Gallager for the $i$-th channel is

$$E_r(\rho, \lambda_i) = -\frac{1}{n'}\ln\int\left(\int q(x)\,p(y|x,\lambda_i)^{\frac{1}{1+\rho}}\,dx\right)^{1+\rho} dy.$$

If the input distribution $q(x)$ is the optimal Gaussian distribution, then with the use of [8] the above expression can be rewritten as

$$E_r(\rho, \lambda) = -\frac{1}{n'}\sum_{i=1}^{n'}\ln\left(1 + \frac{\lambda_i}{\sigma(1+\rho)}\right)^{-\rho}.$$

Now we can use the Szegő theorem from [1] to obtain the average of the exponent, so that the exponent of the system is

$$E_r(\rho) = -\frac{1}{2\pi}\int_0^{2\pi}\ln\left(1 + \frac{f(x)}{\sigma(1+\rho)}\right)^{-\rho} dx$$

where $f(x) = \sum_{k=-\infty}^{\infty}\left(\sum_{j=0}^{l-|k|} h_j h_{j+|k|}\right)e^{ikx}$, which is the same as in [10], [8].
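The Szegő step replaces an average over the eigenvalues of $H^T H$ by an integral over the spectrum $f$. The sketch below checks this numerically for the test function $F(t) = \ln(1 + t/\sigma^2)$, which is chosen here only for illustration and is not claimed to be the exact integrand of the exponent; the function names and the example taps are hypothetical.

```python
import numpy as np

def eigenvalue_average(h, sigma, n):
    """(1/n) * sum_i ln(1 + t_i / sigma^2) over the eigenvalues t_i of H^T H."""
    l = len(h) - 1
    H = np.zeros((n + l, n))
    for j in range(n):
        H[j:j + l + 1, j] = h        # full convolution matrix of the taps
    eig = np.linalg.eigvalsh(H.T @ H)
    return np.mean(np.log1p(eig / sigma**2))

def spectral_average(h, sigma, grid=4096):
    """(1/2pi) * int_0^{2pi} ln(1 + f(x) / sigma^2) dx on a uniform grid."""
    lam = 2 * np.pi * np.arange(grid) / grid
    taps = np.asarray(h, dtype=float)
    f = np.abs(np.exp(-1j * np.outer(lam, np.arange(len(taps)))) @ taps) ** 2
    return np.mean(np.log1p(f / sigma**2))

h = [1.0, 0.5, 0.25]
print(eigenvalue_average(h, 1.0, 400), spectral_average(h, 1.0))
```

For a fixed short tap vector the two averages approach each other as the block length grows, which is what allows the per-channel exponents to be replaced by a single integral.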

In the simulation we used a 4-dimensional fading vector whose components were generated randomly with uniform distribution on [0, 1]. For other randomly generated vectors we get similar results. The two error exponents were positive in the same region but, surprisingly, the new error exponent was better (higher) than Gallager's.

Figure 1 shows that the new error exponent is always as good as or better than Gallager's.

Fig. 1. Difference of the new error exponent and Gallager's error exponent

The new method gives a better error exponent, but it is hard to compute. We could estimate the difference only in 4 dimensions, because of the computational difficulty of finding the global optimum of a 4-dimensional function.
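To make the minimisation in the new exponent concrete, the sketch below evaluates $\min_{h,\sigma}\{d((h,\sigma)\|(h^\circ,\sigma^\circ)) + |I(h,\sigma) - R|^{+}\}$ by grid search in the simplest special case of a single tap (no ISI), where $I(h,\sigma) = \tfrac12\ln(1 + h^2/\sigma^2)$. The restriction to one tap, the natural logarithms, the grids, and the function name are all assumptions of this illustration; it is not the 4-dimensional computation reported above.

```python
import numpy as np

def error_exponent(rate, h0, sigma0, h_grid, sigma_grid):
    """Grid-search value of min over (h, sigma) of d + |I - rate|^+ (nats),
    for a single-tap channel with true parameters (h0, sigma0)."""
    h, s = np.meshgrid(h_grid, sigma_grid, indexing="ij")
    divergence = (-0.5 * np.log(s**2 / sigma0**2) - 0.5
                  + (s**2 + (h - h0) ** 2) / (2 * sigma0**2))
    info = 0.5 * np.log1p(h**2 / s**2)
    return float(np.min(divergence + np.maximum(info - rate, 0.0)))

h_grid = np.linspace(-2.0, 2.0, 81)
sigma_grid = np.linspace(0.05, 2.0, 40)
for rate in (0.1, 0.4, 0.8):
    print(rate, error_exponent(rate, 1.0, 0.5, h_grid, sigma_grid))
```

The exponent is positive for rates below the capacity $\tfrac12\ln(1 + h^{\circ 2}/\sigma^{\circ 2})$ and drops to zero at and above it. With more taps the same search runs over a higher-dimensional grid, which is the source of the computational difficulty mentioned in the text.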

V. Discussion 

Firstly, our result can be used as a new lower bound on error exponents with no CSI at the transmitter. Note that our error exponent (23) is positive if the rate is smaller than the capacity.

Secondly, it gives a new idea for decoding incoming signals without any CSI: it may be worthwhile to perform a more difficult maximisation instead of dealing with channel estimation. This can be done because of the universality of the code, which means that the detection method does not depend on the channel.

We have proved that if the ISI fulfils some criteria (see Theorem 1), then the message can be detected with exponentially small error probability. However, these criteria can be relaxed, because in the Theorem $P_n \to \infty$ and $l_n \to \infty$, so any ISI with finite length and finite energy, and any finite white noise variance, can be approximated well via the approximating set of parameters.

It can easily be seen that the lemmas and the theorem remain true with small changes if the input distribution is an arbitrarily chosen i.i.d. (absolutely continuous or discrete) distribution. Only the functional form of the mutual information $I(\cdot)$ and of the entropy of the output variable $H_{h,\sigma}(Y)$ changes. So this result can be used for lower bounding the error exponent if non-Gaussian i.i.d. random variables are used for the random selection of the codebook. However, in these cases the entropy of the output can hardly be expressed in closed form.

With the result of Theorem 1 we can define the channel capacity for compound fading channels. If the fading remains unchanged during the transmission, and the fading length satisfies $l_n \ll n$, we can state the following theorem:

Theorem 2 Let $\mathcal{F}$ be an arbitrarily given, not necessarily finite set of channel parameters. Then the capacity of the ISI channel without any CSI, with channel parameter from $\mathcal{F}$, is

$$C(\mathcal{F}) = \inf_{(h,\sigma) \in \mathcal{F}} I(h, \sigma)$$

Proof: In the limit the set $\mathcal{H}_n$ is dense in the space of real sequences of any length, so for every $(h,\sigma) \in \mathcal{F}$ there exists a sequence $(h_n,\sigma_n) \in \mathcal{H}_n \times \mathcal{V}_n$ such that $(h_n,\sigma_n) \to (h,\sigma)$. We know from Remark 1 that the error exponent in Theorem 1 is a continuous function, so Theorem 1 proves that $C(\mathcal{F})$ is an achievable rate.

Conversely, given a linear Gaussian channel with ISI $h$ and variance $\sigma$, the capacity is $I(h, \sigma)$ if the transmitter has no CSI, so no rate above the infimum can be achieved for every channel in $\mathcal{F}$. ■

For some $(h, \sigma)$ our error exponent gives a better numerical result than the random coding error exponent derived by Gallager [6] and improved by Kaplan and Shamai [8] (the random coding error exponent used here is deduced in Section IV).

This result is not so surprising if we know that the Maximum Mutual Information (MMI) decoder gives a better exponent in some cases (as in the multiple-access environment) than the random coding error exponent derived by Gallager.

This work does not contradict [9]. We know that in the discrete case the MMI decoder is nothing else than the generalised maximum likelihood (GML) decoder [9], and it was also shown in [9] that the GML decoder is not optimal in the non-memoryless case. However, the two decoders do not coincide in the continuous case, where the entropy of the incoming signal depends on the parameter $(h,\sigma)$ used.